CONVERGENCE OF THE QR ALGORITHM

ABSTRACT 

The QR algorithm of J. G. F. Francis is used
in computing matrix eigenvalues.  The convergence proof
given here is an analogue of Rutishauser's proof of the
convergence of the LR algorithm.
NYO-10,437
CONVERGENCE OF THE QR ALGORITHM

by 
Beresford  Parlett 

1.   Introduction

J. G. F. Francis presented his QR transformation
in 1961 in two articles [1].  We give here a new
proof of the main result, namely Theorem 3 in [1].

Francis  proposed  an  algorithm  for  generating  from 
a  given  square  matrix  a  sequence  of  unitarily  similar 
matrices.   He  proved  that  if  the  eigenvalues  have  distinct 
moduli  then,  in  the  limit,  the  form  of  the  matrices  is 
triangular  and,  although  the  sequence  may  not  converge  in 
the  strict  sense,  the  diagonal  elements  do  tend  to  the 
eigenvalues. 

In practice an elegant modification of the basic
algorithm provides a speedy and stable method for calculating
matrix eigenvalues on a digital computer.  In this
paper we shall be concerned only with the basic algorithm.

The QR transformation is an analogue of the LR
transformation of Rutishauser [2].  Although it is more
complicated technically, our convergence proof is an analogue
of Rutishauser's convergence proof for the LR
transformation.  The convergence rate emerges naturally
in the course of the proof.

In  the  next  section  we  describe  those  properties  of 
the   QR   transformation  which  are  essential  for  our  proof. 
The  author  would  like  to  thank  Eugene  Isaacson  for  helpful 
suggestions  on  the  presentation  of  the  argument. 

2.   The   QR  Transformation 

From the original matrix $A$ ($= A^{(1)}$), assumed non-singular,
a sequence $\{A^{(k)}\}$ is produced as follows.  At
the k-th stage $A^{(k)}$ is decomposed into a product of a
unitary matrix $Q^{(k)}$ by an upper (or right) triangular
matrix $R^{(k)}$.  Then $A^{(k+1)}$ is formed by post-multiplying
$R^{(k)}$ by $Q^{(k)}$.  The factorization is always possible
by Theorem 1 in [1] and can be accomplished in a stable
manner.  By requiring that the diagonal elements of $R^{(k)}$
be positive the decomposition becomes unique.  Thus
$A^{(1)} = A$,

(21)    $A^{(k)} = Q^{(k)} R^{(k)}$,   $A^{(k+1)} = R^{(k)} Q^{(k)}$,   $k = 1, 2, \ldots$
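
The recurrence (21) can be sketched in a few lines. The following is a minimal illustration only, assuming a real non-singular matrix stored as a list of rows; the helper names `qr_gram_schmidt`, `matmul` and `qr_iterate` are hypothetical, and the factorization is done by classical Gram-Schmidt (which yields the positive diagonal required for uniqueness), not by the stable method a practical program would use.

```python
import math

def qr_gram_schmidt(a):
    """Return (q, r) with a = q r, q orthogonal, r upper triangular with
    positive diagonal. Assumes a is real, square and non-singular."""
    n = len(a)
    cols = [[a[i][j] for i in range(n)] for j in range(n)]  # columns of a
    q_cols = []
    r = [[0.0] * n for _ in range(n)]
    for j in range(n):
        v = cols[j][:]
        for i, q in enumerate(q_cols):
            # Component of column j along the already built q_i.
            r[i][j] = sum(q[t] * cols[j][t] for t in range(n))
            v = [v[t] - r[i][j] * q[t] for t in range(n)]
        r[j][j] = math.sqrt(sum(x * x for x in v))  # positive by construction
        q_cols.append([x / r[j][j] for x in v])
    q = [[q_cols[j][i] for j in range(n)] for i in range(n)]
    return q, r

def matmul(x, y):
    """Plain matrix product of two lists of rows."""
    return [[sum(x[i][t] * y[t][j] for t in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def qr_iterate(a, steps):
    """Apply the basic (unshifted) step A -> Q R -> R Q `steps` times."""
    for _ in range(steps):
        q, r = qr_gram_schmidt(a)
        a = matmul(r, q)
    return a

# Illustrative symmetric matrix with eigenvalues (5 +/- sqrt(5)) / 2.
a = [[2.0, 1.0], [1.0, 3.0]]
ak = qr_iterate(a, 60)
```

For this example the entry below the diagonal of `ak` is negligible and the diagonal carries the two eigenvalues in order of decreasing modulus, which is the behaviour the convergence theorem of this paper describes.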

It follows from these definitions that

(22)    $A^{(k+1)} = Q^{(k)*} Q^{(k-1)*} \cdots Q^{(1)*} \, A \, Q^{(1)} \cdots Q^{(k-1)} Q^{(k)}$

where $M^*$ denotes the conjugate transpose of $M$.  Following
Francis we write

$P^{(k)} = Q^{(1)} Q^{(2)} \cdots Q^{(k)}$    and    $S^{(k)} = R^{(k)} R^{(k-1)} \cdots R^{(1)}$.

Thus the $P^{(k)}$ are unitary and the $S^{(k)}$ are upper
triangular.  It can be verified, by using (21) repeatedly,
that

(23)    $P^{(k)} S^{(k)} = A^k$.

Equations (22) and (23) together give the following useful
result.

THEOREM 2 (Francis).  If $A$ is non-singular then
$A^{(k+1)} = P^{(k)*} A P^{(k)}$ and $P^{(k)} S^{(k)}$ is the unitary-
triangular decomposition of $A^k$.

Thus to prove the convergence of $\{A^{(k)}\}$ as
$k \to \infty$ it suffices to prove the convergence of
$\{P^{(k)}\}$.
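
For completeness, here is a sketch of the induction step behind (23). It uses only (21), the definitions of $P^{(k)}$ and $S^{(k)}$, and relation (22) with its index lowered by one; the conventions $P^{(0)} = S^{(0)} = I$ are assumed for the base case.

```latex
\begin{align*}
A\,P^{(k-1)} &= P^{(k-1)} A^{(k)}
   && \text{since } A^{(k)} = P^{(k-1)*} A\, P^{(k-1)}, \\
P^{(k)} S^{(k)} &= P^{(k-1)} \bigl(Q^{(k)} R^{(k)}\bigr) S^{(k-1)}
   = P^{(k-1)} A^{(k)} S^{(k-1)} \\
 &= A\, P^{(k-1)} S^{(k-1)} = A \cdot A^{k-1} = A^{k}.
\end{align*}
```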

3.   Determinantal Representation of Q

The unitary-triangular, or QR, factorization of
any non-singular matrix $M$ is mathematically equivalent
to the Gram-Schmidt process for orthonormalising the
linearly independent columns of $M$.  The usual process
amounts to post-multiplying the matrix $M$ by an upper
triangular matrix, actually $R^{-1}$, to obtain a unitary
matrix $Q$.  The usual normalisation invoked to give
uniqueness to the Gram-Schmidt process is precisely the
requirement that $R$ have positive diagonal elements, see [3].
Only when there is need to stress the dependence of $Q$
and $R$ on $M$ will we write $Q = Q(M)$, $R = R(M)$.

In the proofs which follow we shall use the following
notation.  The j-th column of a matrix $M$ is $m_j$,
except for the identity matrix whose j-th column is denoted
by $e_j$.  Let $\delta_j(M)$ denote the leading principal $j \times j$
minor (determinant) of $M$ and let $\gamma_j(M) = \delta_j(M^*M)$.  If
$M$ is $n \times n$ then $\gamma_n(M)$ is the Gramian (determinant) of $M$.
Let $\tilde m_j$ be the vector obtained by replacing the numbers in
the last row of $\gamma_j(M)$ by the vectors $m_1, m_2, \ldots, m_j$.  Thus

$\gamma_j(M) = \det \begin{pmatrix} m_1^* m_1 & \cdots & m_1^* m_j \\ \vdots & & \vdots \\ m_j^* m_1 & \cdots & m_j^* m_j \end{pmatrix}$,    $\gamma_0 = 1$,

$\tilde m_j = \det \begin{pmatrix} m_1^* m_1 & \cdots & m_1^* m_j \\ \vdots & & \vdots \\ m_{j-1}^* m_1 & \cdots & m_{j-1}^* m_j \\ m_1 & \cdots & m_j \end{pmatrix}$.

By expanding the latter determinant formally with respect
to the last row we see that $\tilde m_j$ is a linear combination
of $m_1, m_2, \ldots, m_j$.
Now suppose that $M = QR$.  With the notation
given above we can give a determinantal representation
for $q_j$ (the j-th column of $Q$).  This result is given
by Szego in [3].

LEMMA 1.

(31)    $q_j = \bigl(\gamma_j(M)\,\gamma_{j-1}(M)\bigr)^{-1/2}\, \tilde m_j$.

The lemma follows from the observation that $\tilde m_j$ is
orthogonal to $m_1, \ldots, m_{j-1}$, and that $\tilde m_j^* \tilde m_j = \gamma_j(M)\,\gamma_{j-1}(M)$.

This lemma is the tool for our convergence proof.
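
As an illustration of Lemma 1, the sketch below builds each column of Q from Gram determinants alone. It assumes a small real matrix given as a list of rows; the names `det`, `gram` and `q_column` are hypothetical, and the vector called `tilde` is the one obtained by expanding the bordered Gram determinant along its last row (written $\tilde m_j$ here).

```python
import math

def det(a):
    """Determinant by Laplace expansion along the first row (tiny matrices only)."""
    if not a:
        return 1.0  # empty determinant, matching gamma_0 = 1
    if len(a) == 1:
        return a[0][0]
    total = 0.0
    for c in range(len(a)):
        minor = [row[:c] + row[c + 1:] for row in a[1:]]
        total += (-1) ** c * a[0][c] * det(minor)
    return total

def column(m, j):
    return [row[j] for row in m]

def gram(m, j):
    """Leading j x j block of M^T M for real M; its determinant is gamma_j(M)."""
    cols = [column(m, c) for c in range(j)]
    return [[sum(x * y for x, y in zip(cols[r], cols[c])) for c in range(j)]
            for r in range(j)]

def q_column(m, j):
    """Column j (1-based) of Q computed from formula (31)."""
    g = gram(m, j)
    tilde = [0.0] * len(m)
    for c in range(j):
        # Cofactor of the last-row entry in column c of the Gram matrix.
        minor = [row[:c] + row[c + 1:] for row in g[:-1]]
        cof = (-1) ** (j - 1 + c) * det(minor)
        tilde = [t + cof * x for t, x in zip(tilde, column(m, c))]
    scale = math.sqrt(det(g) * det(gram(m, j - 1)))
    return [t / scale for t in tilde]

# Illustrative non-singular matrix (an assumption, not taken from the paper).
m = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
q_cols = [q_column(m, j) for j in (1, 2, 3)]
# r[i][j] is the inner product of q_i with column j of m, i.e. R = Q^T M.
r = [[sum(q_cols[i][t] * m[t][j] for t in range(3)) for j in range(3)]
     for i in range(3)]
```

The columns in `q_cols` come out orthonormal and `r` is upper triangular with positive diagonal, so the determinantal formula reproduces the normalised Gram-Schmidt factorization for this example.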

A representation for the i-th element of $\tilde m_j$ will
prove convenient.  For a given value of $j$ denote by $M_{i,j}$
the matrix $M$ with $m_j$ replaced by $e_i$, then

(32)    $e_i^* \tilde m_j = \delta_j(M_{i,j}^* M)$.

4.   A lemma on determinants

We shall need to know the form of the determinant
of a certain product of matrices.  Let $Z_n = \{1, \ldots, n\}$
and let $Z_{n,k}$ be the set of all strictly increasing
sequences of $k$ elements chosen from the set $Z_n$.

LEMMA 2.   Let $A, L, B, M, C$ be $n \times n$ matrices, $L$ and $M$
diagonal.  Then

(41)    $\delta_k(ALBMC) = \sum_{\sigma,\rho \in Z_{n,k}} g^{\sigma,\rho}\, (l_{\sigma_1} \cdots l_{\sigma_k})\, (m_{\rho_1} \cdots m_{\rho_k})$.

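Proof.   For $\alpha, \beta \in Z_{n,k}$ and $E$ an $n \times n$ matrix
let $E^{\alpha}_{\beta}$ be the submatrix obtained by selecting from $E$
rows $\alpha_1, \ldots, \alpha_k$ and columns $\beta_1, \ldots, \beta_k$.  Let
$\underline{k} = (1, 2, \ldots, k)$, then by the theory of determinants,

$\det\bigl((ELF)^{\underline{k}}_{\beta}\bigr) = \sum_{\alpha \in Z_{n,k}} \det(E^{\underline{k}}_{\alpha})\, \det\bigl((LF)^{\alpha}_{\beta}\bigr)$

$\qquad = \sum_{\alpha \in Z_{n,k}} \det(E^{\underline{k}}_{\alpha})\, \det(F^{\alpha}_{\beta})\, (l_{\alpha_1} \cdots l_{\alpha_k})$.

Two applications of this formula give the result of the
lemma with

(40)    $g^{\sigma,\rho} = \det(A^{\underline{k}}_{\sigma})\, \det(B^{\sigma}_{\rho})\, \det(C^{\rho}_{\underline{k}})$,    $\rho, \sigma \in Z_{n,k}$.

COROLLARY.   If the k-th row of $A$ is $e_k^*$ then
$l_k$ is a factor of $\delta_k(ALBMC)$.

Proof.   For all $\sigma \in Z_{n,k}$, $k \notin \sigma$ implies that row
$k$ of $A^{\underline{k}}_{\sigma}$ is null.  Thus the coefficients $g^{\sigma,\rho}$ in (40) vanish
unless the term involves $l_k$.    Q. E. D.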
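
Lemma 2 and its corollary can be checked by brute force on small integer matrices. The sketch below is only an illustration: the helper names are hypothetical, indices are 0-based (so the leading sequence is `range(k)`), and the sample matrices are arbitrary choices rather than data from the paper.

```python
from itertools import combinations
from math import prod

def det(a):
    """Exact determinant by Laplace expansion (integer entries, tiny sizes)."""
    if not a:
        return 1
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** c * a[0][c] * det([row[:c] + row[c + 1:] for row in a[1:]])
               for c in range(len(a)))

def matmul(x, y):
    return [[sum(x[i][t] * y[t][j] for t in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def diag(v):
    return [[v[i] if i == j else 0 for j in range(len(v))] for i in range(len(v))]

def sub(e, rows, cols):
    """Submatrix of e on the given row and column index sequences."""
    return [[e[r][c] for c in cols] for r in rows]

def lemma2_terms(a, b, c, k):
    """Yield (sigma, rho, g) with g the coefficient of formula (40)."""
    lead = tuple(range(k))
    for sigma in combinations(range(len(a)), k):
        for rho in combinations(range(len(a)), k):
            g = det(sub(a, lead, sigma)) * det(sub(b, sigma, rho)) * det(sub(c, rho, lead))
            yield sigma, rho, g

def lemma2_rhs(a, l, b, m, c, k):
    """Right-hand side of (41); l and m hold the diagonals of L and M."""
    return sum(g * prod(l[s] for s in sigma) * prod(m[r] for r in rho)
               for sigma, rho, g in lemma2_terms(a, b, c, k))

def lemma2_lhs(a, l, b, m, c, k):
    """Leading principal k x k minor of the product A L B M C."""
    p = matmul(matmul(matmul(matmul(a, diag(l)), b), diag(m)), c)
    return det(sub(p, range(k), range(k)))

a = [[1, 2, 0], [0, 1, 3], [2, 0, 1]]
b = [[1, 0, 2], [1, 1, 0], [0, 3, 1]]
c = [[2, 1, 0], [0, 1, 1], [1, 0, 3]]
l, m, k = [2, 3, 5], [7, 1, 2], 2
lhs, rhs = lemma2_lhs(a, l, b, m, c, k), lemma2_rhs(a, l, b, m, c, k)

# Corollary: make row k of A equal to the unit row e_k (0-based index k - 1).
a_unit = [[1, 2, 0], [0, 1, 0], [2, 0, 1]]
stray = [g for sigma, rho, g in lemma2_terms(a_unit, b, c, k) if (k - 1) not in sigma]
```

Here `lhs` and `rhs` agree exactly, and every coefficient collected in `stray` is zero, which is the statement that only terms containing $l_k$ survive.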

5.   Convergence of the algorithm.

In Section 2 we saw that the QR algorithm when
applied to any non-singular $n \times n$ matrix $A = A^{(1)}$
produces a sequence $\{A^{(k)}\}$ in which $A^{(k+1)} = P^{(k)*} A P^{(k)}$
and $P^{(k)}$ is the unitary factor of $M = A^k$; i.e. $P = P^{(k)} = Q(A^k)$.
Equations (31) and (32) of Section 3 applied to $M$ give a
representation of a typical element of $P$.

(50)    $p_{ij} = e_i^* p_j = \dfrac{\delta_j(M_{i,j}^* M)}{\bigl(\gamma_j(M)\,\gamma_{j-1}(M)\bigr)^{1/2}}$,

where $\gamma_j(B) = \delta_j(B^*B)$, the leading principal $j \times j$ minor
of $B^*B$, and $M_{i,j}$ is $M$ with column $j$ replaced by $e_i$.

To  exhibit  the  conceptual  simplicity  of  the  proof  of 
Francis'  Theorem  3  we  consider  first  the  special  case  of 
well-ordered  eigenvalues. 

THEOREM 3.   Let $A = X \Lambda Y$ where $Y = X^{-1}$,
$\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$.

If   (i)    $|\lambda_1| > |\lambda_2| > \cdots > |\lambda_n| > 0$,

and   (ii)    $\delta_\nu(Y) \ne 0$,   $\nu = 1, \ldots, n$,

then, as $k \to \infty$, the elements of $A^{(k)}$ below the principal
diagonal tend to zero, the moduli of those above tend to fixed
values, and $a_{ii}^{(k)} \to \lambda_i$, $i = 1, \ldots, n$.

Proof.   We examine first the denominator of (50).  Now
$\gamma_j(M) = \delta_j(Y^* \bar\Lambda^k X^* X \Lambda^k Y)$ and by Lemma 2 this is a sum
of terms of the form

(51)    $\varepsilon\, (\bar\lambda_{\sigma_1} \cdots \bar\lambda_{\sigma_j})^k\, (\lambda_{\rho_1} \cdots \lambda_{\rho_j})^k$,

where $\sigma, \rho \in Z_{n,j}$ and $\varepsilon$ depends on $\sigma$ and $\rho$ but not on $k$.
Hence, by (i),

(52)    $\gamma_j(M) \sim \varepsilon_j\, |\lambda_1 \cdots \lambda_j|^{2k}\, (1 + O(r_j^k))$,   $k \to \infty$,

provided that $\varepsilon_j \ne 0$, where $r_j = |\lambda_{j+1} / \lambda_j|$, $r_0 = r_n = 0$.
From equation (40) we have

(53)    $\varepsilon_j = \gamma_j(X)\, |\delta_j(Y)|^2 > 0$

since $\gamma_j(X)$ is the Gram determinant of $j$ of the linearly
independent columns of $X$ and $\delta_j(Y) \ne 0$ by (ii).

To examine the numerator of (50) we observe that

(54)    $M_{i,j} = X_{i,j}\, \Lambda^k_{j,j}\, Y_{j,j}$.

Thus $\delta_j(M_{i,j}^* M) = \delta_j(Y_{j,j}^*\, \bar\Lambda^k_{j,j}\, X_{i,j}^* X \Lambda^k Y)$ and may be
expanded as a sum of terms of the form given in (51).  However
row $j$ of $Y_{j,j}^*$ is $e_j^*$ and so, by the corollary of
Lemma 2, each non-zero term in the expansion includes the
$(j,j)$ element of $\bar\Lambda^k_{j,j}$, which is 1, and hence
involves only $j-1$ factors $\bar\lambda_\nu^k$, $\nu \ne j$.  Thus $\delta_j(M_{i,j}^* M)$
is a sum of terms of the form

(55)    $\eta\, (\bar\lambda_{\tau_1} \cdots \bar\lambda_{\tau_{j-1}})^k\, (\lambda_{\rho_1} \cdots \lambda_{\rho_j})^k$

where $\rho \in Z_{n,j}$, $\tau \in Z_{n,j-1}$ with $j \notin \tau$, and $\eta = \eta(\tau, \rho)$
does not depend on $k$.  Hence by (i), as $k \to \infty$,

(56)    $\delta_j(M_{i,j}^* M) \sim \eta_{ij}\, |\lambda_1 \cdots \lambda_{j-1}|^{2k}\, \lambda_j^k\, (1 + O(r_j^k))$,

provided that $\eta_{ij} \ne 0$.  From equation (40) we have

(57)    $\eta_{ij} = \overline{\delta_{j-1}(Y)}\; \delta_j(X_{i,j}^* X)\; \delta_j(Y)$.

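If $\delta_j(X_{i,j}^* X) \ne 0$ then $\eta_{ij} \ne 0$, by (ii), and we may
substitute (52) and (56) into (50) to obtain

(58)    $p_{ij} \sim \dfrac{\eta_{ij}}{(\varepsilon_j \varepsilon_{j-1})^{1/2}} \left( \dfrac{\lambda_j}{|\lambda_j|} \right)^{k} \bigl(1 + O(r_{j-1}^k) + O(r_j^k)\bigr)$.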
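
The substitution leading to (58) can be written out as follows. This is a sketch of the leading-order algebra only, with error terms dropped; the symbols $\varepsilon_j$, $\eta_{ij}$, $\gamma_j$ and $\delta_j$ are read from (50), (52) and (56) as understood here.

```latex
\begin{align*}
p_{ij}
 &= \frac{\delta_j(M_{i,j}^{*}M)}{\bigl(\gamma_j(M)\,\gamma_{j-1}(M)\bigr)^{1/2}} \\
 &\sim \frac{\eta_{ij}\,|\lambda_1\cdots\lambda_{j-1}|^{2k}\,\lambda_j^{k}}
        {\bigl(\varepsilon_j\,|\lambda_1\cdots\lambda_j|^{2k}\;
               \varepsilon_{j-1}\,|\lambda_1\cdots\lambda_{j-1}|^{2k}\bigr)^{1/2}} \\
 &= \frac{\eta_{ij}}{(\varepsilon_j\varepsilon_{j-1})^{1/2}}\;
    \frac{|\lambda_1\cdots\lambda_{j-1}|^{2k}\,\lambda_j^{k}}
         {|\lambda_1\cdots\lambda_{j-1}|^{2k}\,|\lambda_j|^{k}}
  = \frac{\eta_{ij}}{(\varepsilon_j\varepsilon_{j-1})^{1/2}}
    \left(\frac{\lambda_j}{|\lambda_j|}\right)^{k}.
\end{align*}
```

The factor $(\lambda_j/|\lambda_j|)^k$ has modulus one, which is why only the moduli of the elements of $P^{(k)}$ are claimed to converge.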


By comparing (53) and (57) with (31) and (32) we find

(59)    $\dfrac{\eta_{ij}}{(\varepsilon_j \varepsilon_{j-1})^{1/2}} = \dfrac{\operatorname{sgn} \delta_j(Y)}{\operatorname{sgn} \delta_{j-1}(Y)}\; q_{ij}(X)$

where $q_{ij}(X)$ is the $(i,j)$ element of $Q = Q(X)$.  By
(ii) we note that if $\eta_{ij} = 0$ then $q_{ij}(X) = 0$ and,
though (58) fails, $p_{ij} \to 0$.  Thus whatever the value
of $\eta_{ij}$ we have, as $k \to \infty$,

(60)    $|p_{ij}^{(k+1)}| \to |q_{ij}(X)|$;   $i, j = 1, \ldots, n$.


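Since the signs in (58) and (59) do not depend on
$i$ and since $X = QR$ we have, as $k \to \infty$,

(61)    $|a_{ij}^{(k+1)}| = \Bigl| \sum_{\alpha,\beta=1}^{n} a_{\alpha\beta}\, \bar p_{\alpha i}^{(k)}\, p_{\beta j}^{(k)} \Bigr| \to \Bigl| \sum_{\alpha,\beta=1}^{n} a_{\alpha\beta}\, \bar q_{\alpha i}\, q_{\beta j} \Bigr| = |(Q^* A Q)_{ij}| = |(R \Lambda R^{-1})_{ij}|$,

(62)    $a_{ii}^{(k+1)} = \sum_{\alpha,\beta=1}^{n} a_{\alpha\beta}\, \bar p_{\alpha i}^{(k)}\, p_{\beta i}^{(k)} \to \sum_{\alpha,\beta=1}^{n} a_{\alpha\beta}\, \bar q_{\alpha i}\, q_{\beta i} = (Q^* A Q)_{ii} = (R \Lambda R^{-1})_{ii} = \lambda_i$.

This proves Theorem 3 since $R$ is upper triangular.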

6.   Convergence  of  the  algorithm  with  disorder 
of  the  eigenvalues. 

We now show how the proof of Theorem 3 may be carried
over with almost no alteration for any non-singular matrix
with eigenvalues of distinct modulus.  The key is to keep
hypothesis (ii) at the expense of (i).

For any $A$ the canonical form $A = X \Lambda Y$ is unique
only up to a permutation $P \Lambda P^{T}$ of $\Lambda$ and the corresponding
permutations in the columns of $X$ and the rows of $Y$.
We now relabel the elements of $\Lambda$ so that for
$\nu = 1, \ldots, n$; $\delta_\nu(Y) \ne 0$ and $\lambda_\nu$ ($= \Lambda_{\nu\nu}$) is an eigenvalue of
maximal modulus such that this is so.  Since $Y$ is non-singular
this relabelling is always possible.
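
The relabelling just described amounts to a greedy selection, which can be sketched as below. This is an illustration under stated assumptions: row `i` of `y` belongs to eigenvalue `lam[i]`, the entries are integers or `Fraction`s so that the test for a zero minor is exact, and the names `det` and `relabel` are hypothetical.

```python
def det(a):
    """Exact determinant by Laplace expansion (tiny matrices only)."""
    if not a:
        return 1
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** c * a[0][c] * det([row[:c] + row[c + 1:] for row in a[1:]])
               for c in range(len(a)))

def relabel(lam, y):
    """Return the order in which the eigenvalues should be labelled.

    At step nu the eigenvalue of largest modulus is chosen among those
    not yet used whose row, appended to the rows already chosen, gives a
    non-zero leading nu x nu minor of the permuted Y.
    """
    n = len(lam)
    remaining = sorted(range(n), key=lambda i: -abs(lam[i]))
    order = []
    for nu in range(1, n + 1):
        for idx in remaining:
            rows = order + [idx]
            minor = [[y[r][c] for c in range(nu)] for r in rows]
            if det(minor) != 0:
                order.append(idx)
                remaining.remove(idx)
                break
        else:
            raise ValueError("no admissible row found; is Y singular?")
    return order

# A = [[1, 1], [0, 2]] has eigenvalues 2 and 1 with X = [[1, 1], [1, 0]],
# so Y = X^{-1} = [[0, 1], [1, -1]] and the first leading minor of Y vanishes.
order = relabel([2, 1], [[0, 1], [1, -1]])
```

For this matrix the selection places the eigenvalue 1 first and 2 second. This matches the fact that the upper triangular matrix [[1, 1], [0, 2]] is left unchanged by the basic QR step, so its diagonal keeps the order (1, 2).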

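We need two lemmas to show that with this new
labelling the dominant terms among those of the form (51)
and (55) are given by the same expressions as before,
namely (52) and (56).

We recall that if $\sigma, \rho \in Z_{n,j}$ then $Y^{\sigma}_{\rho}$ denotes
the $j \times j$ submatrix involving rows $\sigma_1, \ldots, \sigma_j$ and
columns $\rho_1, \ldots, \rho_j$ of the $n \times n$ matrix $Y$.  Also $\underline{j} = (1, \ldots, j)$.

For given $i, j$, the coefficients $\varepsilon$ and $\eta$ in
(51) and (55) are given by equation (40) as

$\varepsilon(\sigma, \rho) = \overline{\det(Y^{\sigma}_{\underline{j}})}\; \det[(X^* X)^{\sigma}_{\rho}]\; \det(Y^{\rho}_{\underline{j}})$,

$\eta(\tau, \rho) = \overline{\det(Y^{\tau}_{\underline{j-1}})}\; \det[(X_{i,j}^* X)^{\tau'}_{\rho}]\; \det(Y^{\rho}_{\underline{j}})$,

where $\sigma, \rho, \tau' \in Z_{n,j}$, $\tau \in Z_{n,j-1}$, $j \notin \tau$, and $\tau'$ is $\tau$
supplemented with $j$.  From these expressions it is clear that
if $|\lambda_{\rho_1} \cdots \lambda_{\rho_j}| > |\lambda_1 \cdots \lambda_j|$ implies $\det(Y^{\rho}_{\underline{j}}) = 0$ then all
terms which might dominate $\varepsilon_j |\lambda_1 \cdots \lambda_j|^{2k}$ and
$\eta_{ij} |\lambda_1 \cdots \lambda_{j-1}|^{2k} \lambda_j^k$ as $k \to \infty$ will, in fact, have zero
coefficients.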

LEMMA 3.   Let $V = \{v_i : i = 1, \ldots, k+1\}$ be a set of
any $k+1$ k-dimensional vectors.  Then the subset $U$ of
those $v_i$ such that the remaining vectors are linearly
independent is either empty or linearly dependent.

Proof.   If $U$ is not empty then $v_{k+1}$ (say) $\in U$.  Then
$v_{k+1} = \sum_{i=1}^{k} \alpha_i v_i$ with unique coefficients $\alpha_i$.  If $U = V$
then $U$ is clearly linearly dependent.  If $v_1$ (say) $\notin U$
then $\{v_2, \ldots, v_{k+1}\}$ is dependent whilst $\{v_2, \ldots, v_k\}$ is
independent since $v_{k+1} \in U$.  Thus $v_{k+1} = \sum_{i=2}^{k} \beta_i v_i$ with
unique $\beta_i$.  Hence $\alpha_1 = 0$ and, similarly, $\alpha_j = 0$ for
all $j$ such that $v_j \notin U$.  Thus $v_{k+1}$ is either null or
linearly dependent on the remaining $v_i \in U$.    Q. E. D.
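
A small brute-force check of Lemma 3 may help fix ideas. The sketch below is illustrative only: the helper names `rank`, `independent` and `subset_u` are hypothetical, the vectors are arbitrary examples, and ranks are computed exactly with `Fraction` arithmetic.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of vectors by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    rk = 0
    for col in range(ncols):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            factor = rows[r][col] / rows[rk][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rk])]
        rk += 1
    return rk

def independent(vectors):
    return rank(vectors) == len(vectors)

def subset_u(vs):
    """The v_i whose removal leaves a linearly independent set."""
    return [v for i, v in enumerate(vs) if independent(vs[:i] + vs[i + 1:])]

# Three vectors in the plane (k = 2).
vs = [(1, 0), (0, 1), (0, 2)]
u = subset_u(vs)
```

Here `u` consists of the two parallel vectors `(0, 1)` and `(0, 2)`, a linearly dependent set as the lemma asserts; for three mutually parallel vectors `subset_u` returns the empty list, which is the other alternative.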

Let $\mu_1, \ldots, \mu_n$ be any non-zero numbers related to
a given $n \times n$ matrix $Y$ by the ordering condition that
$\delta_\nu(Y) \ne 0$, $\nu = 1, \ldots, n$, but for every $\mu_k$ and $\sigma$ satisfying
$|\mu_k| > |\mu_j|$ and $\sigma_i = i$, $i \le j-1$, $\sigma_j = k$ then $\det(Y^{\sigma}_{\underline{j}}) = 0$.  For
$\rho \in Z_{n,j}$ let $\mu^{\rho}$ denote $\mu_{\rho_1} \cdots \mu_{\rho_j}$.  We can now state

LEMMA 4.   Let $Y$ be an $n \times n$ matrix related to
non-zero numbers $\mu_1, \ldots, \mu_n$ by the ordering condition
given above.  For $j = 1, \ldots, n$ and $\sigma \in Z_{n,j}$ the following
statement holds.  If $\det(Y^{\sigma}_{\underline{j}}) \ne 0$, $\sigma \ne \underline{j}$, then
$|\mu^{\sigma}| \le |\mu^{\underline{j}}|$ with equality only if for each $\sigma_m > j$
there exists $l_m \le j$ such that $|\mu_{l_m}| = |\mu_{\sigma_m}|$.

m      ^ 

Proof.   For $j = 1$, $\det(Y^{\sigma}_{\underline{1}}) = y_{\sigma_1 1}$ and, by the ordering
condition, $y_{\sigma_1 1} \ne 0$ implies $|\mu_{\sigma_1}| \le |\mu_1|$.  Thus the lemma
is true for $j = 1$; we now assume it to hold for $j = k < n$.

For $\sigma \in Z_{n,k+1}$ let $\sigma(i)$ denote $\sigma$ without the
element $\sigma_i$.  Let $r_\nu$ and $r'_\nu$ denote row $\sigma_\nu$ of the
matrices $Y_{\underline{k}}$ and $Y_{\underline{k+1}}$.  To prove that the lemma holds
for $j = k+1$ we assume that $\det(Y^{\sigma}_{\underline{k+1}}) \ne 0$, $\sigma \ne \underline{k+1}$.  Let

$S = \{\sigma_\nu : \det(Y^{\sigma(\nu)}_{\underline{k}}) \ne 0\}$.

By Lemma 3 the set $\{r_i : \sigma_i \in S\}$ is linearly dependent.
Also, if $|\mu_{\sigma_i}| > |\mu_{k+1}|$ for each $\sigma_i \in S$ then, by the ordering
condition, $r'_i$ is linearly dependent on the independent
set of rows $\nu = 1, \ldots, k$ of $Y_{\underline{k+1}}$.  Thus there are unique $\alpha_{i\nu}$ such that
Since (*) holds, a fortiori, with $r_\nu$ in place of $r'_\nu$
it follows that the linear dependence of $\{r_i : \sigma_i \in S\}$
implies the linear dependence of $\{r'_i : \sigma_i \in S\}$.  This
contradicts the hypothesis $\det(Y^{\sigma}_{\underline{k+1}}) \ne 0$ and so we cannot
have $|\mu_{\sigma_i}| > |\mu_{k+1}|$ for all $\sigma_i \in S$.

By hypothesis $|\mu^{\sigma(i)}| \le |\mu^{\underline{k}}|$ for all $\sigma_i \in S$.  If $|\mu^{\sigma}| > |\mu^{\underline{k+1}}|$
then $|\mu_{\sigma_i}| > |\mu_{k+1}|$ for all $\sigma_i \in S$.  Hence $|\mu^{\sigma}| \le |\mu^{\underline{k+1}}|$ as was to be
proved.  We now examine the case of equality.

If $|\mu^{\sigma}| = |\mu^{\underline{k+1}}|$ and $|\mu^{\sigma(i)}| < |\mu^{\underline{k}}|$ for all $\sigma_i \in S$ then
again $|\mu_{\sigma_i}| > |\mu_{k+1}|$ for all $\sigma_i \in S$.  Hence for at least one
$\sigma_i \in S$ we must have $|\mu^{\sigma(i)}| = |\mu^{\underline{k}}|$ and so, by hypothesis,
either (a) $\sigma(i) = \underline{k}$ or (b) for each $\sigma_m \in \sigma(i)$, $\sigma_m > k$ there
exists $l_m \le k$ such that $|\mu_{l_m}| = |\mu_{\sigma_m}|$.
In both cases $|\mu_{\sigma_i}| = |\mu_{k+1}|$.

If (a) holds then $\sigma_i > k+1$ since $\sigma \ne \underline{k+1}$ and
$\sigma_p \le k$, $p \ne i$.  If (b) holds then, a fortiori,
$|\mu_{\sigma_m}| = |\mu_{l_m}|$ for $\sigma_m \in \sigma(i)$, $\sigma_m > k+1$.  If $\sigma_i > k+1$ then,
since $|\mu_{\sigma_i}| = |\mu_{k+1}|$, the lemma is proved for $j = k+1$
in both case (a) and case (b).

By  the  principle  of  finite  induction  the  lemma  holds 
for  all   j   for  which  it  is  meaningful,  i.e.   j  =  l,...,n. 

COROLLARY.   If $|\mu_i| \ne |\mu_j|$, $i \ne j$, then $|\mu^{\sigma}| > |\mu^{\underline{j}}|$
implies $\det(Y^{\sigma}_{\underline{j}}) = 0$.

If the matrix $A$ has eigenvalues of distinct modulus
and we order them with respect to $Y$ as described above then
the proof of Theorem 3 remains valid if we substitute the
phrase "by the corollary of Lemma 4" for the phrase "hence
by (i)" in the two places in which it occurs.  Moreover
$r_j$ must be interpreted not as $|\lambda_{j+1} / \lambda_j|$ but as $|\lambda'_j / \lambda_j|$
where $\lambda'_j$ is the eigenvalue of maximal modulus less than
$|\lambda_j|$.

Thus the conclusion of Theorem 3 is valid for non-singular
matrices with eigenvalues of distinct modulus.